LEAD at Unidata

Status Update, September 23, 2008

Mohan Ramamurthy and Tom Baltzer

[Excerpted from LEAD Annual Report, 8 Aug 2008]

Year-5 of LEAD in Review: Toward a Fully Functional System for Community Deployment

 

Nearing the end of its five-year lifetime as an NSF Large ITR grant, LEAD has pioneered a new approach for integrating complex weather data, assimilation, modeling, mining, and cyberinfrastructure systems in innovative ways to empower researchers and students with capabilities heretofore available at only a few major universities and research or operational centers around the world.  The key point is that LEAD brings these capabilities – using a service-oriented architecture and other relevant technologies as the underpinning – to users with the simplicity of familiar environments such as Amazon.com and Travelocity.com.  By managing the complexity of interoperable cyber tools and providing flexibility and ease in how they can be linked, LEAD allows students and researchers to focus their time on solving the science and engineering problems at hand, providing a means for more deeply understanding the tools and techniques being applied rather than the nuances of data formats, communication protocols, and job execution environments.  Containing virtually all elements of modern cyberinfrastructure – from adaptive sensors and high-performance computing and networking to huge data sets, human decision making and complex virtual organizations – LEAD functionality also has been integrated with the TeraGrid as a successful TeraGrid Science Gateway project and continues to serve as an avant-garde research system for the meteorological and computer science communities.  Indeed, as described throughout this report, LEAD has been a principal application driver for helping TeraGrid identify and solve some of its most important challenges and prepare for the next-generation XD environment.

 

Transforming outcomes into impacts is a foundational goal of LEAD, and the service-oriented architecture being developed by LEAD for conducting its research continues to be enhanced and hardened, now operating around the clock to support students and researchers nationwide.  Following the successful pilot project associated with WxChallenge 2007, LEAD continued to make its resources available to the academic community.  For example, LEAD was used in METR 4133, a senior-level Mesoscale Meteorology class at the University of Oklahoma taught by K. Droegemeier during the fall 2007 term.  In this course, students used the LEAD quasi-geostrophic learning module (developed by Millersville University, MU) with real data and forecasts to understand competing effects of various forcing terms.  The LEAD tutorial was instrumental in helping this class of more than 50 students quickly learn how to navigate the many functions of LEAD and apply the Unidata IDV system to visualize a variety of atmospheric features and processes.  LEAD also was used in ESCI 342, a junior-level Dynamics I course taught at MU in fall 2007 by R. Clark, and in ESCI 241, a sophomore-level Introductory Meteorology course (for majors) taught at MU in fall 2007 by S. Yalda.

 

A much different but equally important application of LEAD is in the development of next-generation forecasting systems that include both ensemble and dynamically adaptive, grid-enabled features.  Once again, LEAD was a significant component of the NOAA Hazardous Weather Test Bed in Norman, Oklahoma and its Spring 2008 Experiment (http://hwt.nssl.noaa.gov/Spring_2008/).  Building upon the success of the 2007 effort, the 2008 experiment utilized elements of the LEAD system, along with existing software at CAPS, to produce, on a daily basis for an approximately 7-week period, a 30-hour, 10-member, 4 km grid spacing, near-continental-US-scale WRF ensemble, a single 30-hour, 2 km grid spacing forecast over the same domain, as well as numerous limited-duration/limited-area forecasts at 2 km grid spacing triggered automatically in response to tornado watches and mesoscale discussions issued by the National Weather Service.  A notably significant addition in 2008 was the assimilation of NEXRAD Level II data from all radars in the Continental United States (CONUS) for use in initializing all WRF model forecasts.  This unprecedented capability provided operational forecasters and researchers with the ability to compare, in a real time operational setting, fine-scale deterministic and ensemble forecasts generated with and without NEXRAD data in the initial conditions. In some cases, the addition of radar data had profoundly positive impacts while in others, the impacts were decidedly negative.  Considerable effort therefore is being expended to analyze results from the 2008 Experiment and preliminary findings will be presented at upcoming conferences and in journal publications. 

 

It is important to note that the forecasts described above were supported by substantial, dedicated resources on the NSF TeraGrid at NCSA, PSC and Indiana University, and additional financial support was provided by NOAA via a C-STAR grant to the University of Oklahoma.  NCAR and NCEP also produced real-time forecasts for the 2008 Experiment, but only CAPS and LEAD assimilated NEXRAD Level II radar data, produced ensembles, and conducted on-demand forecasts, automatically, in response to the weather.  None of these other organizations have such capability, which is a testimony to the importance and uniqueness of LEAD and the value of the TeraGrid in dynamically adaptive applications.  In fact, the 2008 Experiment placed tremendous quality of service demands on the TeraGrid, and the results were considerably improved relative to the TeraGrid’s performance in 2007.

 

Deploying LEAD as a National Facility for Atmospheric Science and Computer Science Research and Education

                                           

From the beginning, the LEAD vision has been to not only conduct excellent research and develop exciting and powerful technologies, but to do so in a practicable way that transforms meteorological research and education and has value to other disciplines.  Without question, that vision is being realized.  However, the transformation can occur only when LEAD is transitioned from a research project and made available as a persistent, stable facility upon which the community can rely.  This notion was expressly stated in the original LEAD proposal, with Unidata as the envisioned home for community deployment.

 

The outcomes and impacts realized by LEAD during its five years as an ITR grant provide a strong foundation and high level of confidence upon which to build a persistent LEAD cyberinfrastructure.  Based upon considerable interest in LEAD by the atmospheric science community, as measured, for example, by the strong positive feedback following the recent workshops, LEAD seeks to develop a 5-year proposal to NSF that focuses on deployment/provisioning as well as on retaining selected components of the LEAD research enterprise, leveraging the tremendous collaborations now in place to ensure that LEAD remains at the forefront of capability.  This concept has the support of the UCAR and NCAR leadership as well as the Unidata Policy Committee.

 

Our vision for the future is a primarily service-oriented environment of data streams, historical case study-type data sets, assimilation and modeling tools, mining and analysis engines, and visualization capabilities that are as pervasive in atmospheric, ocean and Earth science research and education as are desktop computers.

 

The LEAD vision also involves building upon growing education and outreach programs to extend the many LEAD resources into progressively lower grade levels and into communities for which even basic capabilities are unavailable.  Because weather is experienced by every human and is an excellent motivating factor for studying science, our vision includes using LEAD to stimulate interest and broaden participation in STEM (science, technology, engineering, mathematics) education at the grade levels where most students choose, sometimes unwittingly, to avoid science as a career.  Finally, the service-oriented approach to enabling research and education has shown so much promise that Microsoft Research is considering funding a two-year program to bring essential LEAD capabilities into the Windows Vista environment including its workflow and event communication systems.  This is a strong testimony to the potential value of LEAD, that the pathway taken has been appropriate, and that the successes to date are reason for initiating formal deployment and sustaining LEAD as a national facility.  History has shown that the most widely used, transformative systems (e.g., DODS/OPeNDAP) require at least a decade of sustained investment and hardening before their impact is fully realized.

 

 

One-Year No-Cost Extension

 

In July 2008, all nine LEAD institutions requested and were granted a one-year no-cost extension (NCE), which will continue LEAD as an NSF cooperative agreement through 30 September 2009.

 

UPC staff will continue their role in the project, providing data, software, and support as well as valuable assistance in the testing and deployment of, and end-user support for, LEAD systems in the atmospheric sciences community. In addition, Unidata will 1) maintain and upgrade the LEAD portals, grid administration software, and computing cluster to improve reliability and user experience; 2) observe, modify, and re-balance the LEAD storage nodes to keep space and performance maximized; 3) continue to work with LEAD personnel to identify and solve their technology problems; and 4) continue to maintain and operate the Access Grid node at Unidata that is principally used by LEAD.

 

 

The Unidata LEAD Test Bed Status

 

The Unidata LEAD test bed continues to be a primary resource of data for LEAD workflows. This includes:

Figure.  Data now available, both historical (on disk) and in real time, within the LEAD and broader community data catalogs.

We are exploring how the Unidata community can benefit to an even greater extent from this important resource.

LEAD Tutorial at the Annual WRF Workshop

 

The LEAD project, led by Unidata’s Tom Baltzer, presented a 90-minute tutorial as part of the annual WRF Workshop in Boulder, CO on June 27, 2008. It was attended by about 30 participants, most of whom brought a laptop and were able to access the LEAD Portal, construct and submit WRF-based workflows, and view results. An important capability made available by LEAD was namelist editing for the WRF model. Further, this was the first time LEAD had a presence at the WRF Workshop, and it represented an opportunity to explain how LEAD and the WRF Portal, being developed by NCAR and NOAA, differ, and how both groups are working to leverage and couple their capabilities.

 

In early June, flooding in Bloomington, Indiana caused severe power problems in the computer room where the output of LEAD workflows is housed. Thus, even basic capability did not exist three days prior to the tutorial. The IU team worked extremely hard to bring the system back online, but the long-term outage and the quick fixes implemented did result in the failure of 50% of the 80 workflows submitted by WRF Workshop participants (the system was still too fragile to handle the scale of 30 workflows being submitted). In general, participants had positive comments about LEAD and felt it provided important capabilities not otherwise available, including in the WRF Portal. Supplemental funding recommended for approval by NSF to partially interface capabilities of LEAD and the WRF Portal will yield important user benefits for both systems.

 

LEAD Beyond the ITR Phase

The LEAD PIs continue to strategize about securing funding for a continued LEAD deployment facility. (Continued LEAD CS research would be pursued under an OCI CDI initiative, though that would likely not involve Unidata.)

Several LEAD principal investigators, including Mohan Ramamurthy, are planning to visit NSF on 29-30 September 2008 to meet with NSF program officers and to discuss possible scenarios for deployment as well as continued development of various components of the LEAD cyberinfrastructure.  Despite repeated attempts to engage NSF officials in a dialog to discuss avenues for possibly continuing LEAD activities beyond the ITR phase, not much progress has been made for various reasons, including the transitioning of the project to a new NSF program official. The PIs sincerely hope that the conversations that were initiated last year can be reinitiated and that LEAD PIs and NSF will be able to engage in a dialog about extending LEAD activities beyond the period of performance for the current award and the one-year no-cost extension period.

LEAD Staff Departures

The uncertainty surrounding the future of LEAD and its funding, unfortunately, has resulted in the departure of two staff members, Anne Wilson (in May 2008) and Tom Baltzer (in September 2008), who were funded by LEAD.  In light of the fact that LEAD will be in a no-cost extension period starting on 1 October, the UPC has decided that those positions will not be filled.  The UPC is working on a transition and is distributing the work among the existing staff.